psrccensus

The psrccensus package allows R users at PSRC to download, summarize, and visualize Census, ACS, and PUMS data on common PSRC Census geographies.

The three most important functions are:

One additional function lets you create a map of an ACS variable by tract.

To use the psrccensus library, you will first need to get a Census API key.

Set up your API key

You can request a key at https://api.census.gov/data/key_signup.html. The first time you run this code, set your Census API key as an environment variable with Sys.setenv, if you haven’t done so before. After that, you only need to read it back with Sys.getenv. Sys.setenv applies to the current R session only; to keep the key across sessions, store it in your .Renviron file.

library(psrccensus)
#Sys.setenv(CENSUS_API_KEY = 'PUT YOUR KEY HERE')
Sys.getenv("CENSUS_API_KEY")
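A missing key tends to surface only later, as a failed download, so it can help to check for it up front. The following is a minimal sketch in base R; the helper name require_census_key is hypothetical and is not part of psrccensus.

```r
# Hypothetical helper (not part of psrccensus): stop early with a clear
# message when the Census API key is not set in this R session.
require_census_key <- function(var = "CENSUS_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set; run Sys.setenv() first.",
         call. = FALSE)
  }
  # Return the key invisibly so the call does not print it to the console.
  invisible(key)
}
```

Calling require_census_key() at the top of a script would then fail fast, before any download is attempted.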

Next, decide which tables you would like to download and summarize. This is the hardest part, because you have to find the correct table code and decide on the geography and the years.

Finding the table ID

One helpful website for finding the tables on a topic from ACS and Census is: https://censusreporter.org/

You can also search the API variable lists for ACS and Census.

Which variables are available in which year in ACS and Census?

Generally, ACS 1-year data are available for geographies with populations of 65,000 or more, so you can easily get 1-year data for counties or the region, for example. Once you go down to the tract level, 5-year ACS data are more appropriate. Decennial Census data are available down to the block level.
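The 65,000 threshold can be written down as a small rule of thumb. This is only a sketch: the function suggest_acs_type is hypothetical, the population of 4,000 is an illustrative tract-sized value, and the strings 'acs1' and 'acs5' are assumed to match the acs.type argument of get_acs_recs.

```r
# Hypothetical helper: suggest an ACS product from a geography's population.
# Geographies at or above the threshold can use 1-year data; smaller ones
# fall back to the 5-year product.
suggest_acs_type <- function(population, threshold = 65000) {
  ifelse(population >= threshold, "acs1", "acs5")
}

# King County's 2019 total population versus an illustrative tract-sized value.
suggest_acs_type(c(2252782, 4000))
```

The 5-year product also covers large geographies, so this rule picks the most current dataset available rather than the only valid one.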

Get the data

The functions are documented on: https://psrc.github.io/psrccensus/reference/index.html

Get ACS Data

Suppose you wish to tabulate 2019 ACS 1-year estimates of total people by race and ethnicity, as provided in table B03002, by county. You would use the following function call.

get_acs_recs(geography = 'county',
             table.names = c('B03002'),
             years=c(2019),
             acs.type = 'acs1')
## The 1-year ACS provides data for geographies with populations of 65,000 and greater.
## Getting data from the 2019 1-year ACS
## # A tibble: 105 x 11
##    GEOID name        state      variable   estimate   moe label concept census_geography
##    <chr> <chr>       <chr>      <chr>         <dbl> <dbl> <chr> <chr>   <chr>           
##  1 53033 King County Washington B03002_001  2252782    NA Esti~ HISPAN~ County          
##  2 53033 King County Washington B03002_002  2030140    NA Esti~ HISPAN~ County          
##  3 53033 King County Washington B03002_003  1302544  3208 Esti~ HISPAN~ County          
##  4 53033 King County Washington B03002_004   147822  4678 Esti~ HISPAN~ County          
##  5 53033 King County Washington B03002_005    13321  1990 Esti~ HISPAN~ County          
##  6 53033 King County Washington B03002_006   424590  7085 Esti~ HISPAN~ County          
##  7 53033 King County Washington B03002_007    15702  1831 Esti~ HISPAN~ County          
##  8 53033 King County Washington B03002_008     6574  3281 Esti~ HISPAN~ County          
##  9 53033 King County Washington B03002_009   119587  8804 Esti~ HISPAN~ County          
## 10 53033 King County Washington B03002_010     2639  1744 Esti~ HISPAN~ County          
## # ... with 95 more rows, and 2 more variables: acs_type <chr>, year <dbl>
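Once the table is downloaded, a common next step is to turn estimates into shares of the total. The sketch below uses base R on three King County rows copied from the printed output above, not a live download; the share and cv columns are illustrative additions and are not produced by psrccensus.

```r
# Three rows copied from the printed output above (King County, 2019 ACS 1-year).
king <- data.frame(
  variable = c("B03002_001", "B03002_003", "B03002_004"),
  estimate = c(2252782, 1302544, 147822),
  moe      = c(NA, 3208, 4678)
)

# B03002_001 is the table total; divide each estimate by it to get a share.
total <- king$estimate[king$variable == "B03002_001"]
king$share <- king$estimate / total

# ACS margins of error are published at the 90% confidence level, so
# moe / 1.645 approximates the standard error; dividing by the estimate
# gives a coefficient of variation.
king$cv <- (king$moe / 1.645) / king$estimate

king
```

The coefficient of variation is NA wherever the margin of error is NA, as it is for the total row here.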

Get Census data

To generate Decennial Census tables for housing units and total population by MSA, you would call:

get_decennial_recs(geography = 'msa',
                   table_codes = c("H001", "P001"),
                   year = 2010,
                   fips = c('42660', "28420"))
## Getting data from the 2010 decennial Census
## Loading SF1 variables for 2010 from table H001. To cache this dataset for faster access to Census tables in the future, run this function with `cache_table = TRUE`. You only need to do this once per Census dataset.
## Using Census Summary File 1
## Getting data from the 2010 decennial Census
## Loading SF1 variables for 2010 from table P001. To cache this dataset for faster access to Census tables in the future, run this function with `cache_table = TRUE`. You only need to do this once per Census dataset.
## Using Census Summary File 1
## # A tibble: 4 x 6
##   GEOID NAME                                    variable   value label concept  
##   <chr> <chr>                                   <chr>      <dbl> <chr> <chr>    
## 1 28420 Kennewick-Pasco-Richland, WA Metro Area H001001    93041 Total HOUSING ~
## 2 42660 Seattle-Tacoma-Bellevue, WA Metro Area  H001001  1463295 Total HOUSING ~
## 3 28420 Kennewick-Pasco-Richland, WA Metro Area P001001   253340 Total TOTAL PO~
## 4 42660 Seattle-Tacoma-Bellevue, WA Metro Area  P001001  3439809 Total TOTAL PO~
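The result is in long form, with one row per geography and variable. To compare the two variables, it can be reshaped to one row per MSA. This is a minimal base R sketch using the four rows copied from the printed output above; the persons_per_unit column is an illustrative name.

```r
# Rows copied from the printed output above (2010 decennial Census, SF1).
msa <- data.frame(
  GEOID    = c("28420", "42660", "28420", "42660"),
  variable = c("H001001", "H001001", "P001001", "P001001"),
  value    = c(93041, 1463295, 253340, 3439809)
)

# One row per MSA, one column per variable (value.H001001, value.P001001).
wide <- reshape(msa, idvar = "GEOID", timevar = "variable", direction = "wide")

# Total population divided by total housing units.
wide$persons_per_unit <- wide$value.P001001 / wide$value.H001001

wide
```

The same reshape could be done with tidyr if it is already loaded; base R is used here only to keep the sketch free of extra dependencies.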

Make a map of ACS data by tract

Let’s say you want to create a map of the tracts in the region for one variable. You can use the function create_tract_map. Here’s an example, mapping the non-Hispanic Black or African American alone population by tract:

library(sf)
## Warning: package 'sf' was built under R version 4.0.5
## Linking to GEOS 3.9.0, GDAL 3.2.1, PROJ 7.2.1
library(dplyr)
## Warning: package 'dplyr' was built under R version 4.0.5
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
tract.big.tbl <- psrccensus::get_acs_recs(geography = 'tract',
                                          table.names = c('B03002'),
                                          years = c(2019))
## Getting data from the 2015-2019 5-year ACS
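# Keep only the non-Hispanic Black or African American alone estimate
tract.tbl <- tract.big.tbl %>%
  filter(label == 'Estimate!!Total:!!Not Hispanic or Latino:!!Black or African American alone')
# Connection string for the ElmerGeo spatial database
gdb.nm <- paste0("MSSQL:server=", "AWS-PROD-SQL\\Sockeye",
                 ";database=", "ElmerGeo", ";trusted_connection=yes")
spn <- 2285  # EPSG code for NAD83 / Washington North (ftUS)
tract_layer_name <- "dbo.tract2010_nowater"
tract.lyr <- st_read(gdb.nm, tract_layer_name, crs = spn)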
## Reading layer `dbo.tract2010_nowater' from data source 
##   `MSSQL:server=AWS-PROD-SQL\Sockeye;database=ElmerGeo;trusted_connection=yes' 
##   using driver `MSSQLSpatial'
## Simple feature collection with 773 features and 19 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 1099353 ymin: -97548.53 xmax: 1622631 ymax: 477101.5
## Projected CRS: NAD83 / Washington North (ftUS)
create_tract_map(tract.tbl, tract.lyr,
                 map.title = 'Black, non-Hispanic Population',
                 map.title.position = 'topleft',
                 legend.title = 'Black, Non-Hispanic Population',
                 legend.subtitle = 'by Census Tract')